Abstract: Data are presented on the effects of Animated Agents on multimedia learning environments with 
specific concerns of split attention and modality effects. The study was a 3 (agent properties: agent only, 
agent with gestures, no agent) x 3 (picture features: static picture, sudden onset, animation) factorial design 
with outcome measures consisting of a mental load rating scale, a persona rating scale, multiple-choice 
questions, a matching test, a retention test, and transfer tests involving creative solutions. Overall, no 
split-attention or modality effects were found when the agent was integrated into the display. 

Interest in the use of animated pedagogical agents in instructional design involving multimedia in 
virtual learning environments has increased recently, as new technologies have made them more accessible 
(Craig, Hu, Marks, & Graesser, 1999; Johnson, Rickel, & Lester, 2000). An animated pedagogical agent is 
a computerized character (either humanlike or otherwise) that can interact with a user in order to impart 
some type of information. Because these agents are relatively new, there has been little research into their 
proper construction, capabilities, use, or limitations. 

Animated agents can be seen as logical extensions of the development and customization of new 
learning interfaces. A line of research conducted by Byron Reeves and Clifford Nass (1996) provides a 
context for exploring these interfaces. They put forth the basic principles of what they call the "Media 
equation theory." This model holds that people naturally interact with various forms of media in the same 
ways they interact with other people. In this context, media can include anything from written text to 
television to computer programs. 

If people tend to anthropomorphize media such as computers and the programs that run on them, 
there may be real advantages to implementing pedagogical agents in computer interfaces. First, agents 
increase the bandwidth of communication by the addition of a direct conversational partner, a partner who 
is potentially capable of showing varied emotional states and patterns of deixis. Second, agents may 
increase the computer's ability to engage learners and motivate them. Furthermore, appropriate lifelike 
behaviors make agents appear knowledgeable, attentive to the learner, and helpful (Johnson et al., 2000). 

Animated pedagogical agents would seem to provide a challenge to multimedia environments, 
given the precepts of the cognitive theory of multimedia learning (Moreno & Mayer, 1999). If the agents 
are integrated as a part of an illustration or animation, their presence could cause split-attention effects 
(Sweller & Chandler, 1994), or modality effects (Moreno & Mayer, 1999). Modality effects could result 
from both the agent and learning materials being presented in the visual modality. Such effects might not 
be overcome entirely by the integration of spoken text with a picture or animation, because learners might 
concentrate on the agent and ignore the learning materials. What might be required is to direct attention 
away from the agent who provides the spoken text and toward the appropriate visual materials, for example 
through gesturing by the agent (Johnson et al., 2000) or attention capture within the animation itself (Yantis & 
Hillstrom, 1994). There is currently evidence that animated pedagogical agents used in virtual learning 
environments can promote baseline problem solving skills (Johnson et al., 2000; Moreno, 2001; Moreno, 
Mayer, Spires, & Lester, 2001). However, it is not clear what role agents play in learning environments 
(Andre et al., 1999). 

One of the first studies (Lester, Voerman, et al., 1997; Lester, Towns, & Fitzgerald, 1999) of 
animated agents led to what is called the "Persona effect." The claim is that the presence of a lifelike 
character in the environment has a positive impact on the learner's interactive experience. The study also 
revealed that more expressive agents are given higher ratings on clarity and utility than less expressive 
ones. Herman the Bug was the agent in the study. Participants rated how helpful Herman was as an aid 
to their learning experience. However, it is important to note that the Persona effect is silent on whether 
participants learn more when interacting with agents (Lester, Voerman, et al., 1997). It only states that 
learners enjoy the experience more. 

The present study was designed to investigate issues related to attention by manipulating agent 
properties and features of the pictorial information. Two possibilities present themselves in this context. 
First, deictic gestures by the agent can be used to direct the learner's attention while integrating the 
animated agent with a picture or animation. Pointing and gesturing are natural ways in which both adults 
and children attempt to direct attention (Alibali & DiRusso, 1999; Krauss, 1998). Gestures should occur 
simultaneously with or prior to the onset of the speech act that they signify (Moirel & Krauss, 1992). A second 
option is to capture attention by using parts of the picture or animation itself. According to attention research, 
an excellent way to capture attention is through abrupt onset and motion (Jonides & Yantis, 1988; Yantis & 
Hillstrom, 1994). 

The design was a 3 (agent properties: agent only, agent with gestures, no agent) x 3 (picture features: 
static picture, sudden onset, animation) factorial. If the use of an agent leads to split attention effects and if 
these effects are reduced by agent gesture, then we would expect to see the no agent condition > the agent 
condition < agent with gesture condition. If there were no split attention effect, then no differences would 
be seen between agent properties. The cognitive theory of multimedia learning would predict that within 
the picture features animation > static picture. Also, within the picture features sudden onset > static 
picture if attention capture is sufficient for learning. 
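
The factorial structure can be made concrete with a short sketch. It enumerates the nine cells of the design and derives the degrees of freedom for a two-way between-subjects ANOVA from the sample of 135 reported in the Participants section. Equal cell sizes are an assumption (the paper does not report them), and all names in the code are illustrative.

```python
from itertools import product

# Factor levels as named in the design description.
AGENT_LEVELS = ["no agent", "agent only", "agent with gestures"]
PICTURE_LEVELS = ["static picture", "sudden onset", "animation"]

# Sample size reported in the Participants section.
N_PARTICIPANTS = 135


def design_cells():
    """Return every (agent, picture) combination of the 3 x 3 design."""
    return list(product(AGENT_LEVELS, PICTURE_LEVELS))


def degrees_of_freedom(n_total, n_agent, n_picture):
    """Degrees of freedom for a two-way between-subjects ANOVA."""
    return {
        "agent": n_agent - 1,
        "picture": n_picture - 1,
        "interaction": (n_agent - 1) * (n_picture - 1),
        "error": n_total - n_agent * n_picture,
    }


cells = design_cells()
# ASSUMPTION: equal cell sizes; the paper does not report them.
per_cell = N_PARTICIPANTS // len(cells)
df = degrees_of_freedom(N_PARTICIPANTS, len(AGENT_LEVELS), len(PICTURE_LEVELS))

for agent, picture in cells:
    print(f"{agent:20s} x {picture}")
print("participants per cell (assumed equal):", per_cell)
print("degrees of freedom:", df)
```

Under these assumptions the error term has 126 degrees of freedom, which agrees with the denominator degrees of freedom reported for the cognitive load analysis.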

Methods 

Participants 

Participants in this experiment were 135 undergraduate psychology students 
at the University of Memphis who volunteered from a pool of participants. This pool consisted of all 
students taking either of two levels of introductory psychology courses. 

Materials 

The materials for the experiment were of two kinds: computerized materials and pencil-and-paper materials. 
The computerized materials consisted of the visual and narrative information presentation (training 
section). The pencil-and-paper materials consisted of a questionnaire for domain knowledge, a test of 
spatial ability, a mental load rating scale, a persona rating scale, multiple-choice questions, a matching test, 
a retention test, and transfer tests involving creative solutions. 

The computerized materials were created using three different computer application packages. The 
agent and voice were created using the Microsoft Agent software package (Microsoft, 1998). The 
multimedia animations were created using Macromedia Flash 3.0 (Macromedia, 1998). These packages 
were integrated using a program called Xtrain (Hu, 1998; Hu & Craig, 2000). 

Likert-type scales were used for the persona test and mental load rating. The persona test ranged 
from 1 to 6, with 1 being extremely enjoyable and 6 being extremely not enjoyable. The scale was similar 
to those described in previous research literature (Johnson et al., 2000; Lester et al., 1997). The 
mental load rating ranged from 1 to 6, with 1 being extremely easy and 6 being extremely difficult. This 
subjective rating has been used in previous research as a measure of the cognitive load of a task (Kalyuga, 
Chandler, & Sweller, 1999; Paas & Van Merrienboer, 1993, 1994). 

Participants also received two pencil and paper tests at the outset of the experiment session. Both of 
these tests were brief and were given prior to the multimedia information presentation. The tests were for 
domain knowledge and spatial ability. 

The test of domain knowledge was a standard screening test used in related research (Mayer & 
Moreno, 1998; Moreno & Mayer, 1999). This questionnaire consisted of a seven-item activity checklist 
containing statements concerning weather knowledge, with one point added for each checked item, and a 
five-level self-assessment ranging from less than average domain knowledge (1) to very much (5). The cut-off 
criterion adopted was six (out of a maximum 11) in order to be consistent with related research (Mayer, 
1997; Mayer & Moreno, 1998; Moreno & Mayer, 1999). The test of spatial ability was a standard 
paper-folding task with scores used as a covariate. In the task, participants were given twelve minutes to 
correctly answer as many questions as they could (Bennet, Seashore, & Wesman, 1972). 
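
As a rough illustration of the screening rule, the sketch below adds one point per checked item to the self-assessment rating and compares the total with the cut-off of six. Treating a participant as eligible when the total is strictly below the cut-off is an assumption, based on the Procedure's description of eligible participants as those "scoring under half"; the function and parameter names are hypothetical.

```python
CHECKLIST_ITEMS = 7   # seven-item activity checklist
CUTOFF = 6            # cut-off criterion reported in the text


def domain_knowledge_score(checked_items, self_rating):
    """One point per checked item plus the self-assessment rating."""
    if not 0 <= checked_items <= CHECKLIST_ITEMS:
        raise ValueError("checked_items must be between 0 and 7")
    if not 1 <= self_rating <= 5:
        raise ValueError("self_rating must be between 1 and 5")
    return checked_items + self_rating


def is_eligible(checked_items, self_rating, cutoff=CUTOFF):
    # ASSUMPTION: participants scoring below the cut-off count as
    # low-knowledge learners and remain in the study.
    return domain_knowledge_score(checked_items, self_rating) < cutoff


print(is_eligible(2, 2))  # total of 4
print(is_eligible(5, 3))  # total of 8
```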

The information presentation was concerned with the process of lightning formation. These 
materials have been shown to be effective for achieving learning gains in previous research (Mayer & 
Moreno, 1998; Moreno & Mayer, 1999). The scenario presented followed a causal path from how a storm 
front forms to the creation and display of lightning. 

The three remaining tests have been used in previous research (Mayer & Moreno, 1998; Moreno & 
Mayer, 1999). These involved retention, matching, and transfer. The retention test consisted of one 
question, "Please write down an explanation of how lightning works" (See Appendix D for example). 
These tests were collected after five minutes. The matching test consisted of four frames with instructions 
that ask the participants to circle and label the cool moist air, warmer surface, updraft, freezing level, 
downdraft, gusts of cool wind, stepped leader, and the return stroke. These were collected after three 
minutes. The test for creative solutions consisted of four questions presented one at a time for three minutes 
each. 

In addition to the tests used by Mayer, participants were also presented with a series of six 
multiple-choice questions requiring a forced choice among four possibilities, one correct answer and three foils. 
These six questions assessed three categories of knowledge (explicit shallow, explicit deep, and implicit 
deep). The explicit shallow questions focused on the shallow surface-level information from the 
presentation (e.g., The upper portion of the cloud is made up of what?). The explicit deep questions focused 
on understanding of the concepts that were presented in the information delivery (e.g., When do 
downdrafts occur?). The implicit deep questions focused on the application of the underlying concepts to 
problems that were not focused on in the information delivery (e.g., Why does it get colder right before it 
rains?). 

Procedure 

The basic procedure was as follows. When the participants first entered the laboratory, they were 
issued a packet of materials. This packet contained their informed consent, test of domain knowledge, and 
the test of spatial ability. After these were completed, those eligible for the study (those scoring under half 
on the domain knowledge test) received instructions for part two of the experiment. They were presented 
with the information delivery, which took about three minutes. Afterward, they were given the retention 
question (5 minutes), the multiple-choice questions (2 minutes), the matching test (3 minutes), and four 
transfer questions (3 minutes each). 

Results and Discussion 

Persona effect 

A 3 (agent properties: no agent, agent, agent with gesture) x 3 (picture features: picture, onset, 
motion) ANOVA was performed on the persona data. There was no evidence for a persona effect in this 
study. There could be several explanations for this. First, participants were exposed to only one condition and 
thus had nothing with which to make a comparison. If this is the case, a within-subject design would be a 
more accurate measure. Second, the agent was displayed for only 180 seconds. Since the persona effect is 
based on the agent making the learning experience more enjoyable (Johnson et al., 2000), an agent 
interaction of that length was probably not long enough to produce an effect. 

Even though the persona effect (Andre et al., 1999; Lester, Converse, et al., 1997) was not 
significant, a trend can be seen in the data. On this scale, a lower score indicated a more positive rating. The 
mean ratings for the three groups were M = 3.44 for no agent present, M = 3.42 for agent present, and M = 
3.07 for agent with gestures. A Cohen’s f effect size 
was calculated for the persona data. This analysis yielded an effect size of .42 in the 
agent-with-gestures vs. no-agent comparison and .02 in the agent-only vs. no-agent comparison. This 
pattern is a trend in the direction predicted by the persona effect. 
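
For readers unfamiliar with the statistic, Cohen's f is the standard deviation of the group means divided by the common within-group standard deviation; for two equally sized groups it equals half of Cohen's d. The sketch below shows that computation using the persona means reported above. The within-group standard deviation is not given in the text, so `ASSUMED_SD` is a placeholder, and the function name is illustrative rather than the authors' procedure.

```python
from math import sqrt


def cohens_f(group_means, pooled_sd):
    """Cohen's f for equally sized groups: SD of the means / within SD."""
    k = len(group_means)
    grand_mean = sum(group_means) / k
    # Population-style standard deviation of the group means.
    sd_means = sqrt(sum((m - grand_mean) ** 2 for m in group_means) / k)
    return sd_means / pooled_sd


# Persona means reported in the text.
NO_AGENT, AGENT_ONLY, AGENT_GESTURE = 3.44, 3.42, 3.07

# HYPOTHETICAL within-group SD; the real value is not given in the text.
ASSUMED_SD = 1.0

print(round(cohens_f([NO_AGENT, AGENT_GESTURE], ASSUMED_SD), 3))
print(round(cohens_f([NO_AGENT, AGENT_ONLY], ASSUMED_SD), 3))
```

With the placeholder standard deviation the output will not reproduce the reported values of .42 and .02; the actual within-group standard deviation would be needed for that.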

Cognitive Load 

A 3 (agent properties: no agent, agent, agent with gesture) x 3 (picture features: picture, onset, 
motion) ANOVA was performed on the cognitive load data (Paas & Van Merrienboer, 1993). It yielded a 
significant effect of picture features, F(2, 126) = 3.737, p < .05. A post hoc test performed on 
the three picture features groups yielded significant differences in perceived comprehension ratings 
between the picture condition (M = 3.62) and the motion condition (M = 3.09), p < .05. This finding is 
in line with the cognitive theory of multimedia learning, which predicts a decrease in the difficulty of 
comprehension for the motion condition over the picture condition when synchronization of the display is 
attained that ensures temporal and spatial contiguity (Moreno & Mayer, 1999). 

Matching 

A 3 (agent properties: no agent, agent, agent with gesture) x 3 (picture features: picture, onset, 
motion) ANOVA was performed on the matching task data. It yielded a significant effect only for picture 
features, F(2, 133) = 12.434, p < .001. Tukey contrasts revealed that both onset (M = 4.42) and motion (M 
= 4.56) conditions performed significantly better than the picture condition (M = 3.13, p < .001), but that 
the onset and motion conditions did not differ from each other. 

Although these results differ from previous results that did not find differences in the matching 
data (Mayer, 1997; Moreno & Mayer, 1999), they support the cognitive theory of multimedia learning. 
According to this theory, in order for successful learning to occur, there must be an integration of the 
verbally based and the visually based models of the material (Mayer, 1984). This integration may be 
enhanced by the use of animations over pictures (Moreno & Mayer, 1999). Both the motion and the onset 
conditions provided the verbal and visual integration needed for matching in the present study. The 
difference was possibly due to the decreased presentation time that prevented the ceiling effects found in 
previous research by Moreno and Mayer (1999). 

Retention 

A 3 (agent properties: no agent, agent, agent with gesture) x 3 (picture features: picture, onset, 
motion) ANOVA was performed on the retention data. It yielded a significant effect only for picture 
features, F(2, 133) = 25.73, p < .001. Tukey contrasts yielded a difference between the picture condition 
(M = 1.93) and both the onset (M = 4.20) and motion (M = 5.07) conditions. There was no difference 
between the latter two groups. 

Transfer 

The transfer task probed the extent to which participants applied the concepts they learned to other 
problems and exhibited creative solutions. A 3 (agent properties: no agent, agent, agent with gesture) x 3 
(picture features: picture, onset, motion) ANOVA was performed on the transfer task data. This analysis 
yielded a significant effect of picture features only, F(2,133) = 4.03, p < .05. Tukey contrasts yielded a 
difference between the onset (M = 2.13) and picture (M = 1.44) conditions (p < .05). The motion condition 
(M = 1.89) was intermediate and did not differ from either of the other two groups. 

The difference between the onset and picture conditions provides support for the claim that a sudden onset 
of a color singleton directs attention as needed to assist the construction of mental models that facilitate the 
implicit inferences necessary to construct creative solutions. 

Multiple-Choice Questions 

A 3 (agent properties: no agent, agent, agent with gesture) x 3 (picture features: picture, onset, 
motion) ANOVA was performed on the total score obtained from the multiple-choice questions. This 
yielded a significant difference only between picture features, F(2, 133) = 7.58, p < .001. Both the onset (M 
= 3.47) and motion (M = 3.31) groups significantly outperformed the picture (M = 2.60) group (p < .001). 
There was no difference between onset and motion conditions. 

Explicit Shallow. The multiple-choice questions attempted to evaluate three types of knowledge 
(explicit shallow, explicit deep, and implicit deep). A 3 (agent properties: no agent, agent, agent with 
gesture) x 3 (picture features: picture, onset, motion) ANOVA performed on the data from the questions 
that tapped explicit shallow knowledge yielded only a significant effect of picture features, F(2, 133) = 
3.08, p < .05. Tukey contrasts showed that participants in the motion condition (M = 1.64) outperformed 
those in the picture condition (M = 1.36), p < .05. This result is consistent with findings from the matching task, 
which tested for shallow knowledge. 

Explicit Deep. A 3 (agent properties: no agent, agent, agent with gesture) x 3 (picture features: 
picture, onset, motion) ANOVA was performed on the data for the explicit deep questions. It yielded only a 
significant effect for the picture features, F(2, 133) = 7.93, p < .001. Tukey contrasts revealed differences 
between both the onset condition (M = .71) and the motion condition (M = .79) when compared to the 
picture condition (M = .29). The participants in the motion condition outperformed those in the picture 
condition (p < .001). Similarly, those in the onset condition outperformed participants in the picture 
condition (p < .01). These findings suggest that by directing attention appropriately the onset of the color 
singleton (Yantis & Hillstrom, 1994) facilitated deeper learning of core concepts as effectively as an 
animation with motion in the present relatively brief presentation. This would seem to indicate that the 
onset provided the temporal contiguity that, according to Moreno and Mayer (1999), was required to get 
full integration of the verbal and visual representations. 

Implicit Deep. The final multiple-choice questions were designed to tap implicit deep knowledge. 
A 3 (agent properties: no agent, agent, agent with gesture) x 3 (picture features: picture, onset, motion) 
ANOVA performed on the data yielded no significant effects, although the effect of picture features was marginally 
significant, F(2, 133) = 2.738, p = .068. Furthermore, the means (picture M = .96, onset M = 1.20, motion 
M = .91) were in a somewhat similar direction to the transfer data, which attempted to tap the same pool of 
knowledge. 

The means and standard deviations for all conditions and measures are presented in Table 1 below. 





                    Persona   Matching  Retention  Transfer   Multiple  Explicit  Explicit  Implicit
                    Test      Test      Question   Questions  Choice    Shallow   Deep      Deep
                                                              (Total)
Agent Properties
  No agent          3.44      4.04      3.91       1.69       3.40      1.56      0.67      1.18
  Agent Only        3.42      4.04      3.80       1.76       2.96      1.53      0.49      1.00
  Agent w/ Gesture  3.07      4.02      3.49       2.02       3.02      1.52      0.59      0.89
Picture features
  Picture           3.62      3.13      1.93       1.44       2.60      1.36      0.29      0.96
  Onset             3.24      4.42      4.20       2.13       3.47      1.56      0.71      1.20
  Motion            3.09      4.56      5.07       1.89       3.31      1.64      0.79      0.91

Table 1. Table of Means and standard deviations 
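
Because the multiple-choice total is made up of the explicit shallow, explicit deep, and implicit deep questions, the three subscale means in Table 1 should add up to the total within rounding. The following sketch, with illustrative names and the values copied from Table 1, reports the gap for each row; it is a reader's consistency check, not part of the original analysis.

```python
# Multiple-choice means from Table 1:
# (total, explicit shallow, explicit deep, implicit deep)
TABLE_1 = {
    "No agent": (3.40, 1.56, 0.67, 1.18),
    "Agent Only": (2.96, 1.53, 0.49, 1.00),
    "Agent w/ Gesture": (3.02, 1.52, 0.59, 0.89),
    "Picture": (2.60, 1.36, 0.29, 0.96),
    "Onset": (3.47, 1.56, 0.71, 1.20),
    "Motion": (3.31, 1.64, 0.79, 0.91),
}


def subscale_gap(total, *subscales):
    """Sum of subscale means minus the reported total, to 2 decimals."""
    # Adding 0.0 turns a possible -0.0 into 0.0 for cleaner printing.
    return round(sum(subscales) - total, 2) + 0.0


gaps = {cond: subscale_gap(*vals) for cond, vals in TABLE_1.items()}
for cond, gap in gaps.items():
    print(f"{cond:18s} subscales - total = {gap:.2f}")
```

Gaps of a hundredth or two are what rounding of the individual means would produce; the sketch only reports the gaps and makes no claim about the cause of any larger ones.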



Summary and conclusions 

The study revealed several findings. First, there were no differences due to agent properties and, 
thus, there was no evidence of split-attention effects. The presence of the agent in the learning environment 
was not detrimental to learning. It appears that agents can be safely integrated into brief multimedia 
presentations without fear of interference, as proposed by Johnson et al. (2000). Second, most of the 
analyses revealed differences among the image-type conditions that supported the cognitive theory of 
multimedia learning (Moreno & Mayer, 1999). Close temporal synchronization of the narration with the 
animated display enhanced learning, presumably by establishing relationships between the visual and 
verbal representations. Third, the findings suggest that stimulus onset is just as effective as, and in some 
cases more effective than, motion in a fully animated display in directing attention to appropriate parts 
of the display. 